6 research outputs found

    An integrated semantic-based approach in concept based video retrieval

    Multimedia content has been growing quickly, and video retrieval is regarded as one of the most prominent problems in multimedia research. To retrieve a desired video, users express their needs as queries, which can refer to objects, motion, texture, color, audio, and so on. Low-level representations of video differ from the higher-level concepts that a user associates with it, so queries based on semantics are more realistic and tangible for the end user. Understanding the semantics of a query has opened new insight into video retrieval and into bridging the semantic gap. The problem, however, is that video must be manually annotated to support queries expressed in terms of semantic concepts. Annotating the semantic concepts that appear in video shots is a challenging and time-consuming task, and it is not possible to provide annotations for every concept in the real world. This study proposes an integrated semantic-based approach to similarity computation that aims to enhance retrieval effectiveness in concept-based video retrieval. The proposed method integrates knowledge-based and corpus-based semantic word similarity measures in order to retrieve video shots for concepts whose annotations are not available to the system. The TRECVID 2005 dataset is used for evaluation, and the results of the proposed method are compared against the individual knowledge-based and corpus-based semantic word similarity measures used in previous studies in the same domain. The integrated similarity method is shown to be superior in terms of Mean Average Precision (MAP).
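
The abstract above does not say how the knowledge-based and corpus-based scores are integrated, so the following is only a minimal sketch under an assumed rule: a weighted average of the two scores, used to map an unannotated query concept onto the closest annotated concept. The function names, the `weight` parameter, and the toy score tables are hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, Iterable, Tuple

Similarity = Callable[[str, str], float]

# Hypothetical toy scores standing in for real knowledge-based (e.g. a
# taxonomy-derived measure) and corpus-based (e.g. co-occurrence) measures.
TOY_KNOWLEDGE: Dict[Tuple[str, str], float] = {("boat", "ship"): 0.9, ("boat", "car"): 0.4}
TOY_CORPUS: Dict[Tuple[str, str], float] = {("boat", "ship"): 0.7, ("boat", "car"): 0.5}


def lookup(table: Dict[Tuple[str, str], float]) -> Similarity:
    """Turn a score table into a symmetric similarity function (0.0 if unknown)."""
    def sim(a: str, b: str) -> float:
        return table.get((a, b), table.get((b, a), 0.0))
    return sim


def integrated_similarity(a: str, b: str, knowledge_sim: Similarity,
                          corpus_sim: Similarity, weight: float = 0.5) -> float:
    """Assumed integration rule: convex combination of the two scores."""
    return weight * knowledge_sim(a, b) + (1.0 - weight) * corpus_sim(a, b)


def closest_annotated_concept(query: str, annotated: Iterable[str],
                              knowledge_sim: Similarity, corpus_sim: Similarity,
                              weight: float = 0.5) -> str:
    """Return the annotated concept most similar to the unannotated query."""
    return max(annotated,
               key=lambda c: integrated_similarity(query, c, knowledge_sim,
                                                   corpus_sim, weight))


best = closest_annotated_concept("boat", ["car", "ship"],
                                 lookup(TOY_KNOWLEDGE), lookup(TOY_CORPUS))
print(best)
```

In a real system the shots annotated with the returned concept would then be ranked for the original query; that retrieval step is outside this sketch.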

    A Threshold-Based Combination of String and Semantic Similarity Measures for Record Linkage

    Because integrated data carry richer information, integrating different data sources is a key step in most data warehousing and mining projects. One of the principal challenges in integrating databases is duplication: one entity may appear in different formats in different databases, so combining these databases produces duplicate records. Record linkage is a technique used to detect and match the duplicate records generated during data integration. A variety of record linkage models with different steps have been developed to detect such duplicates, and string similarity measures are widely used for comparing record pairs. However, in addition to string similarity, considering the semantic relatedness between two records can also be beneficial when detecting duplicate records, and existing record linkage models do not address this. To determine the importance of semantic similarity in improving the effectiveness of duplicate detection, this study proposes a similarity measure that combines string and semantic similarity measures. The combination uses a threshold-based method that considers the semantic similarity of each field of the dataset; the threshold determines the influence of semantic similarity in the final combination algorithm. The combined similarity measure is evaluated on two real-world datasets, Restaurant and Cora, and its effectiveness is measured with several standard evaluation metrics. The experimental results indicate that the combined measure outperforms the string and semantic similarity measures used individually, with an F-measure of 99.1% on the Restaurant dataset and 88.3% on the Cora dataset. Therefore, based on the experimental results, semantic similarity should be taken into account in addition to string similarity in order to detect duplicate records more effectively in record linkage.
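
The abstract describes a per-field threshold that controls how much semantic similarity influences the combined score, but it does not give the exact rule. The sketch below assumes one plausible rule: the semantic score is used only when it reaches the threshold, otherwise the string score stands alone. The synonym table, the function names, and the default threshold are hypothetical; `difflib.SequenceMatcher` stands in for whichever string measure the study used.

```python
from difflib import SequenceMatcher
from typing import Dict

# Hypothetical toy synonym pairs standing in for a real semantic resource.
TOY_SYNONYMS = {
    frozenset({"street", "road"}),
    frozenset({"cafe", "coffeehouse"}),
}


def string_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] from the standard library."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def semantic_similarity(a: str, b: str) -> float:
    """Toy measure: share of tokens in `a` that equal or are synonyms of a token in `b`."""
    ta, tb = a.lower().split(), b.lower().split()
    if not ta or not tb:
        return 0.0

    def related(x: str, y: str) -> bool:
        return x == y or frozenset({x, y}) in TOY_SYNONYMS

    hits = sum(1 for x in ta if any(related(x, y) for y in tb))
    return hits / max(len(ta), len(tb))


def combined_field_similarity(a: str, b: str, threshold: float = 0.5) -> float:
    """Assumed rule: semantic evidence counts only once it reaches the threshold."""
    s_str = string_similarity(a, b)
    s_sem = semantic_similarity(a, b)
    return max(s_str, s_sem) if s_sem >= threshold else s_str


def record_similarity(r1: Dict[str, str], r2: Dict[str, str],
                      threshold: float = 0.5) -> float:
    """Average the combined similarity over the fields both records share."""
    fields = r1.keys() & r2.keys()
    if not fields:
        return 0.0
    return sum(combined_field_similarity(r1[f], r2[f], threshold)
               for f in fields) / len(fields)


a = {"name": "Blue Cafe", "address": "12 Main Street"}
b = {"name": "Blue Coffeehouse", "address": "12 Main Road"}
print(record_similarity(a, b))
```

Raising the threshold makes the combined score fall back to plain string similarity more often, which is one way to read "the threshold determines the influence of semantic similarity".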

    A comparative study in classification techniques for unsupervised record linkage model

    Problem statement: Record linkage is a technique used to detect and match the duplicate records generated during data integration. A variety of record linkage algorithms with different steps have been developed to detect such duplicates. To decide whether two records are duplicates, different studies use supervised and unsupervised classification techniques. To use supervised classification algorithms without spending a lot of time labeling data manually, previous studies have proposed a two-step method that selects the training data automatically. However, the effectiveness of different classification techniques must also be taken into account in record linkage systems in order to classify records more accurately. Approach: To determine and compare the effectiveness of different supervised classification techniques applied in an unsupervised manner, several prominent classification methods are applied to duplicate record detection. Duplicate detection and classification of records in two real-world datasets, Cora and Restaurant, is carried out with Support Vector Machines, Naïve Bayes, Decision Tree and Bayesian Networks. Results: The experimental results show that Support Vector Machines perform best on the Restaurant dataset, with an F-measure of 96.27%, while Naïve Bayes performs best on the Cora dataset, with an F-measure of 89.7%. Conclusion/Recommendation: The results of detecting duplicate records with different classification techniques tend to fluctuate depending on the dataset used. Moreover, Support Vector Machines and Naïve Bayes outperform the other methods in our experiments.
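
For readers unfamiliar with the metric reported above, here is a minimal sketch of the standard F-measure (the harmonic mean of precision and recall) computed over predicted versus true duplicate pairs. The record IDs and helper names are illustrative assumptions, not data from the study.

```python
from typing import Iterable, Set, Tuple

Pair = Tuple[int, int]


def normalize(pairs: Iterable[Pair]) -> Set[Pair]:
    """Treat (a, b) and (b, a) as the same duplicate pair."""
    return {(min(a, b), max(a, b)) for a, b in pairs}


def f_measure(predicted: Iterable[Pair], actual: Iterable[Pair]) -> float:
    """F1 score of predicted duplicate pairs against the true duplicate pairs."""
    pred, gold = normalize(predicted), normalize(actual)
    true_positives = len(pred & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Illustrative record-ID pairs only.
predicted_pairs = [(1, 2), (4, 3), (5, 6)]
true_pairs = [(1, 2), (3, 4), (7, 8), (9, 10)]
print(f_measure(predicted_pairs, true_pairs))
```

Here two of the three predicted pairs are correct (precision 2/3) and two of the four true pairs are found (recall 1/2), so the F-measure is 4/7.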

    High level semantic concept retrieval using a hybrid similarity method

    In video search and retrieval, the user's need is expressed as a query. Early video retrieval systems usually matched video clips using low-level features such as color, shape, texture, and motion. Although retrieval with such low-level features is accurate and automatic, the semantic meaning of the query cannot be expressed in this way. Moreover, retrieval with concept detectors is limited by the need to provide annotations for each concept, which is challenging and time-consuming; it is not possible to provide annotations for every concept in the real world. To improve retrieval effectiveness, this study proposes and evaluates a similarity computation method that maps concepts whose annotations are not available onto annotated, known concepts. The TRECVID 2005 dataset is used to evaluate the effectiveness of the concept-based video retrieval model when the proposed similarity method is applied. Results are also compared with previous similarity measures used in the same domain. The proposed similarity measure outperforms the other methods, with a Mean Average Precision (MAP) of 26.84% in concept retrieval.
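
As a reference for the MAP figure above, the following is a minimal sketch of the textbook definition of Mean Average Precision over ranked shot lists. It assumes average precision is normalized by the total number of relevant shots per query; the official TRECVID evaluation may differ in details such as the result-list cutoff. The shot IDs are made up for illustration.

```python
from typing import List, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of precision@k taken at each rank k where a relevant shot appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, shot in enumerate(ranked, start=1):
        if shot in relevant:
            hits += 1
            precision_sum += hits / rank
    # Assumption: divide by all relevant shots, so missed shots lower the score.
    return precision_sum / len(relevant)


def mean_average_precision(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average the per-query average precision over all queries."""
    if not runs:
        return 0.0
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)


# Illustrative queries: (ranked shot IDs, set of relevant shot IDs).
example_runs = [
    (["s1", "s2", "s3", "s4"], {"s1", "s3"}),
    (["a", "b"], {"b"}),
]
print(mean_average_precision(example_runs))
```

The first query scores (1/1 + 2/3) / 2 = 5/6 and the second scores 1/2, giving a MAP of 2/3 for this toy example.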